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We introduce a class of interatomic potential models that can be automatically generated from 
data consisting of the energies and forces experienced by atoms, as derived from quantum mechanical 
calculations. The models do not have a fixed functional form and hence are capable of modeling 
complex potential energy landscapes. They are systematically improvable with more data. We 
apply the method to bulk crystals, and test it by calculating properties at high temperatures. Using 
the interatomic potential to generate the long molecular dynamics trajectories required for such 
calculations saves orders of magnitude in computational cost. 
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Atomic scale modeling of materials is now routinely 
and widely applied, and encompasses a range of tech- 
niques from exact quantum chemical methods [1] through 
density functional theory (DFT) [2] and semi-empirical 
quantum mechanics [3] to analytic interatomic potentials 
[4]. The associated trade-offs in accuracy and compu- 
tational cost are well known. Arguably, there is a gap 
between models that treat electrons explicitly, and those 
that do not. Models in the former class are in prac- 
tice limited to handling a few thousand atoms, while the 
simple analytic interatomic potentials are limited in ac- 
curacy, regardless of how they are parametrized. The 
panels in the top row of Fig. 1 illustrates the typical 
performance of analytic potentials in bulk semiconduc- 
tors. Perhaps surprisingly, potentials that are generally 
regarded as adequate for describing these bulk phases 
show significant deviation from the quantum mechani- 
cal potential energy surface. This in turn gives rise to 
significant errors in predicting properties such as elastic 
constants and phonon spectra. 

In this letter we are concerned with the problem of 
modeling the Born-Oppenheimer potential energy sur- 
face (PES) of a set of atoms, but without recourse to 
simulating the electrons explicitly. We mostly restrict 
our attention to modeling the bulk phases of carbon, sili- 
con, germanium, iron and gallium nitride, using a unified 
framework. Even such single-phase potentials could be 
useful for calculating physical properties, e.g. the ther- 
mal expansion coefficient, the phonon contribution to the 
thermal conductivity, the temperature dependence of the 
phonon modes, or as part of QM/MM hybrid schemes [7]. 

The first key insight is that this is actually practicable: 
the reason that interatomic potentials are at all useful is 
that the PES is a relatively smooth function of the nu- 
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FIG. 1: Deviation of atomic forces between DFT and various 
models: the Brenner [5] and Tersoff[6] potentials, and differ- 
ent GAP models for different semiconductors. In the bottom 
row the horizontal lines corresponds to the smallest standard 
deviation of the error theoretically attainable given the range 
of the potential (see text). The configurations are taken from 
molecular dynamics runs at 1000 K. 



clear coordinates. Improving potential modeling is diffi- 
cult not because the PES is rough, but because it does 
not easily decompose into simple closed functional forms. 
Secondly, away from isolated quantum critical points, the 
behavior of atoms is localized in the sense that if the total 
energy of a system is written as a sum of atomic energies, 
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where r^- = r j — is the relative position between atoms 
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i and j, then good approximations of E can be obtained 
by restricting the set of atoms over which the index j runs 
to some fixed neighborhood of atom i, i.e., < r cut . In 
fact, we take Eq. (1) with this restriction as the defining 
feature of an interatomic potential. Note that in general 
it is desirable to separate out Coulomb and dispersion 
terms from the atomic energy function, £({vij}), because 
the covalent part that remains can then be localized much 
better for the same overall accuracy. The strict localiza- 
tion of e enables the independent computation of atomic 
energies. However, it also puts a limit on the accuracy 
with which the PES can be approximated. Consider an 
atom whose environment inside r cut is fixed. The true 
quantum mechanical force on this atom will show a vari- 
ation, depending on its environment outside the cutoff. 
An estimate of this variance is shown on Fig. 1 by the 
horizontal lines: no interatomic potential with the given 
cutoff can have a lower typical force error. 

To date, two works have attempted to model the PES 
in its full generality. In the first [8], small molecules 
were modeled by expanding the total energy in poly- 
nomials of all the atomic coordinates, without restrict- 
ing the range of the atomic energy function. While this 
gave extremely accurate results, it cannot scale to more 
than a few atoms. More recently, a neural network was 
used to model the atomic energy [9] . Our philosophy and 
aims are similar to the latter work: we compute e({rjj}) 
by interpolating a set of stored reference quantum me- 
chanical results using a carefully constructed measure of 
similarity between atomic neighborhoods. We strive for 
computational efficiency in our use of expensive ab ini- 
tio data by using both the total energy and the atomic 
forces to obtain the best possible estimate for e given 
our assumptions about its smoothness. Furthermore, our 
scheme makes the generation of potential models auto- 
matic, with almost no need for human intervention in 
going from quantum mechanical data to the final inter- 
atomic potential model. In the following, we present an 
overview of our formalism. Detailed derivations are given 
in the Supplementary Information (SI). 

The atomic energy function is invariant under trans- 
lation, rotation and the permutation of atoms. One of 
the key ideas in the present work is to represent atomic 
neighborhoods in a transformed system of coordinates 
that accounts for these symmetries. Ideally, this map- 
ping should be one-to-one: mapping different neighbor- 
hood configurations to the same coordinates would in- 
troduce systematic errors into the model that cannot be 
improved by adding more quantum mechanical data. We 
begin by forming a local atomic density from the neigh- 
bors of atom i, as 



Pi (r) = <5(r) + ^5(r - ry) /cut (1^1 



(2) 



where f C ut(f) = l/2 + cos(7rr/r cut )/2 is a cutoff function, 
in which the cutoff radius r cut reflects the spatial scale of 



the interactions. The choice of cutoff function is some- 
what ad-hoc: any smooth function with compact support 
could be used. 

The local atomic density is invariant to permuting the 
atoms in the neighborhood. One way to achieve rota- 
tional invariance as well would be to expand it in spheri- 
cal harmonics and a set of radial basis functions and ap- 
propriately combine the resulting coefficients, similarly 
to how the structure factor is computed from Fourier 
components. However, just as the structure factor (a 
two-point correlation) is missing all "phase" information 
(the relative phases of the different plane waves), such 
a set of spherical invariants would lose a lot of infor- 
mation about the configuration of the neighborhood. In 
contrast, the bispectrum [10], which is a three-point cor- 
relation function, is a much richer system of invariants, 
and can provide an almost one-to-one representation of 
the atomic neighborhood. 

In our method we first project the atomic density onto 
the surface of the four-dimensional unit sphere, simi- 
larly to how the Riemann sphere is constructed, with 
the transformation 



<f> — arctan(y/a;) 
6 = arccos(z/|r|) 
#o = |r|/r 



(3) 



where r > r cut /n. The advantage of this is that the 4D 
surface contains all the information from the 3D spherical 
region inside the cutoff, including the radial dimension, 
and thus 4D spherical harmonics (also called Wigner ma- 
trices, U 3 m i m [11]) constitute a natural complete basis for 
the interior of the 3D sphere, without the need for radial 
basis functions. The projection of the atomic density on 
the surface of the 4D sphere can therefore be expanded in 
4D spherical harmonics using coefficients (dropping the 
atomic index % for clarity) 



C m'm (U m 'm\P) 



(4) 



The bispectrum built from these coefficients is given by 



31 



32 



B 



■•■32,3 ( C m'm) 
mj,mi=- ji Tn' 2 ,m2=—j2 rn' ,m——j 

(jjm Qjm' Ji J2 

jim 1 j 2 m 2 jim[j 2 m' 2 m^mx m' 2 m 2 



(5) 



where C J ™L „■ „, arc the ordinary Clcbsch-Gordan coefh- 
cients. The elements of this three-index array, which we 
will denote by for atom i, are invariant with respect 
to permutation of atoms and rotations of 4D space, and 
hence also 3D space. In practice, we use only a truncated 
version, with < J max , corresponding to a limit in 

the spatial resolution with which we describe the atomic 
neighborhood. 

Determining the PES is now reduced to interpolat- 
ing the atomic energy in the truncated bispectrum-space, 
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and for this we use a non-parametric method called Gaus- 
sian Process (GP) regression [12, 13]. In the GP frame- 
work, assuming Gaussian basis functions, the best esti- 
mate for the atomic energy function is given by 

£ (b) = ]T a ne -§ E,[(6.-6»,)/«.] a = J2 * n G(b, b„), 

n n 

(6) 

where n and I range over the reference configurations 
and bispectrum components, respectively and are 
(hyper) parameters. The GP is called a non-parametric 
method because the kernels G are not fixed but centered 
on the data, and hence, loosely, any continuous function 
can in principle be obtained from Eq. (6) [14]. 

The GP differs from a simple radial basis function 
least-squares fit in the way the coefficients a n are com- 
puted. The covariance, i.e. the measure of similarity, of 
the reference configurations is defined as 

C nn , = 5 2 G(h,h') + o 2 l (7) 

where 5 and a are two further hyperparameters and I 
is the identity matrix. The interpolation coefficients are 
then given by 

{a n } = a = C-V, (8) 

where y = {y n } is the set of reference values (quantum 
mechanical energies). This simple expression for the co- 
efficients is derived in detail in [13]. Thus Eq. (6) gives 
the atomic energy function in closed form as a function 
of the quantum mechanical data. 

In addition to preserving exact symmetries, another 
hurdle is that although we wish to infer the atomic energy 
function, the data we can collect directly are not values 
of atomic energies, but total energies of sets of atoms, 
and forces on atoms, the latter being sums of partial 
derivatives of neighboring atomic energies [15]. Further- 
more, our data will be heavily correlated: e.g. the neigh- 
borhoods of atoms in a slightly perturbed ideal crystal 
are very similar to each other. Both of these problems 
are solved by applying a sparsification procedure [16], in 
which a predetermined number (much smaller than the 
total data size) of "sparse" configurations are chosen ran- 
domly from the set of all configurations and the data val- 
ues y in Eq. (8) are replaced by linear combinations of 
all data values. The models in this work used 300 such 
sparse configurations. The final expression for the model, 
which we call GAP, is derived in the SI. 

All the DFT data in this work was generated with the 
Castep package [17]. The reference configurations were 
obtained by randomly displacing the atoms and the lat- 
tice vectors from their equilibrium values in 2, 8, 16 and 
64-atom cubic unit cells by up to 0.2 A. 

The lower panels of Fig. 1 show the performance of 
the GAP model for semiconductors in terms of the accu- 
racy of forces for near-bulk configurations. For diamond, 
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TABLE I: Table of relaxed diamond surface energies in 
J/m 2 (top) and elastic constants, in units of GPa (bottom). 




Temperature IK 

FIG. 2: a) Phonon dispersion curves for diamond using 
the GAP model (lines), DFT (triangles) and experiment 
(squares) [18]; b) temperature dependence of the T25' mode 
from MD using GAP (circles, 250 atoms, 20 ps) and experi- 
ment (squares) [19] . In accordance with common practice [20] 
the calculated points have been shifted by a constant to agree 
with experiment at zero temperature to account for the anhar- 
monic effects of zero-point motion and a quantum correction 
to the kinetic temperature is also applied[21]; c) phonon dis- 
persion of iron using GAP (solid), Finnis-Sinclair potential 
[22] (dotted) and DFT (triangles). 



the GAP model is shown to improve significantly as the 
cutoff is increased (there is also systematic improvement 
as J max is increased, this is shown in the SI). For all 
three materials the RMS errors in the energy are less 
than 1 meV/at. Table I shows the elastic constants. It 
is remarkable that the existing potentials are not able 
to reproduce all elastic constants to better than 25% for 
any setting of their parameters. Fig. 2 shows the phonon 
spectrum for diamond and iron. For diamond, the GAP 
model shows excellent accuracy at zero temperature over 
most of the Brillouin zone, with a slight deviation for 
optical modes in the A direction. The agreement with 
experiment is also good for the frequency of the Raman 
mode as a function of temperature. For iron, the agree- 
ment with DFT is even better. We also computed the 
linear thermal expansion coefficient of diamond, shown 
in Fig. 3, using two different methods, applicable at low 
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and high temperatures. Our low temperature curve is de- 
rived from the phonon spectrum via the quasi-harmonic 
approximation and agrees well with the DFT and ex- 
perimental results. At higher temperatures higher or- 
der anharmonic terms come into play, so we use molecu- 
lar dynamics (MD) and obtain good agreement with ex- 
periment, showing that the GAP model is accurate sig- 
nificantly beyond the small displacements that control 
phonons. 

Finally, we extended the GAP model by including 
reference configurations generated by random displace- 
ments around a diamond vacancy and graphite. Fig. 4 
shows the energetics of the transition path for a migrating 
vacancy in diamond and the transition from rhombohc- 
dral graphite to diamond. The agreement with DFT is 
excellent, demonstrating that we can construct a truly 
reactive model that describes the sp 2 -sp 3 transition cor- 
rectly, in contrast to currently used interatomic poten- 
tials. 

Even for the small systems considered above, the GAP 
model is orders of magnitude faster than standard plane- 
wave DFT codes, but significantly more expensive than 
simple analytical potentials. The computational cost is 
roughly comparable to the cost of numerical bond order 
potential models[23]. The current implementation of the 
GAP model takes 0.01 s/atom/timestep on a single CPU 
core. For comparison, a timestep of the 216-atom unit 
cell of Fig. 3 takes 191 s/atom using Castep, about 20,000 
times longer, while the same for iron would take more 
than a million times longer. 
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FIG. 3: Linear thermal expansion coefficient of diamond in 
the GAP model (dashed) and DFT (dash-dotted) using the 
quasi-harmonic approximation [24], and derived from MD (216 
atoms, 40 ps) with GAP (solid) and the Brenner potential 
(dotted). Experimental results are shown with squares[25]. 

In summary, we have outlined a framework for auto- 
matically generating finite range interatomic potential 
models from quantum-mechanically calculated atomic 
forces and energies. The models were tested on bulk 
semiconductors and iron and were found to have re- 
markable accuracy in matching the ab initio potential 
energy surface at a fraction of the cost, thus demonstrat- 
ing the fundamental capabilities of the method. Pre- 
liminary data for GaN, presented in the SI, shows that 
the extension to multi-component and charged systems 
is straightforward by augmenting the local energy with 
a simple Coulomb term using fixed charges. Our long 
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FIG. 4: The energetics of the linear transition path for a 
migrating vacancy (top) and for the rhombohedral graphite 
to diamond transformation. 

term goal is to expand the range of interpolated config- 
urations and thus create "general" interatomic potentials 
for one- and two-component materials whose accuracy 
approaches that of quantum mechanics. 
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1 Bispectrum 

An arbitrary function p defined on the surface of a 4D sphere can be numerically represented using the 
hyperspherical harmonic functions U 3 m , m {<j),9,8o). The hyperspherical harmonics form an orthonormal basis 
set thus p can be represented as 

oo j 

o = V V c 7 , u 3 , 

r / j / j m'm m'm' 
j—0 m,m' — —j 

The expansion coefficients c^/ m can be calculated via 

<? , =(U j , \p), 

mm \ m ml~/> 

where (.|.) denotes the inner product. For clarity, the vectors c 3 are constructed from the expansion coef- 
ficients c? m , m . A unitary operation R, such as a rotation, acting on p transforms the coefficient vectors c 3 
according to 

c' 3 = RV, 

where R 3 are unitary matrices, i.e. (R 3 ) R 3 = I. 

The direct product of two rotational matrices, R 31 and R 32 can be decomposed into a direct product of 
R 3 matrices by a unitary transformation 



R 31 <g> R 3 



3 1+32 
j=\jl-32\ 



where the matrix H 3132 is the four-dimensional analouge of the Clebsch-Gordan coefficients. In fact, the ele- 



ments of the matrix are obtained as product of Clebsch-Gordan coefficients: Hi 



1 m\m l ,L 2 m 2 m 2 



f-ilm riln 

l\mil 2 m 2 lim'-L^m^' 



The direct product of the coefficient vectors c 31 and c 32 transforms according to the direct product of the 
rotational matrices 
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We define gii,32,j — using the fact that H 3l < 32 is unitary — as follows: 



31+32 

e 



g3l,32,3 = H JU2 C J1 C J2 ? 



which transforms under rotation as g 3 ' 31 ' 32 — > R,JgJi'J2,j_ ^he cubic rotational invariants, also known as the 
bispectrum, can be constructed as 

Finally, we arrive to the expression for the bispectrum elements, computed as 



B]! ,j 2 ,3 



31 32 3 

EV \ " (J Yfjm gjm' Ji J 2 

/ j / j \ J m'mJ jimij 2 m2 3im' 1 j 2 m' 2 m\m\ m' 2 m2' 



The truncated version of the bispectrum results in a finite array, with 4, 23 and 69 elements for J max = 1, 
3 and 5, respectively. 
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2 Gaussian Process Regression 



Notation and formulae: 

Number of raw atomic neighbourhood configurations 
Number of sparse atomic neighbourhood configurations 
bispectrum of nth reference configuration 
bispectrum of mth sparse configuration 

bispectrum of configuration for which prediction is sought ( "test configuration" ) 
vector of data values at the raw configurations 

Covariance function, the measure of similarity of two configurations 
C(x„,x„<), covariance matrix of raw configurations 
C(x TO ,x m <), covariance matrix of sparse configurations 

[k„] m = C(x„,x m ), covariance matrix of sparse and raw configurations, Cmn = (Cnm) T 
C(x m ,x,), covariance vector of test and sparse configurations 
Diag(diag(CAr - CatmC^ Cmjv)) 
intrinsic noise of data values (hyperparameter) 

Cm + C MN (A + (t 2 I) _1 Cjvm7 pseudo-covariance matrix of the sparse configurations 
k^Q^ Cmjv(A + cr 2 I) _1 y, prediction of atomic energy for test configuration 
C(x»,x*) — k^C^/ — Q]^)k* + a 2 , variance of prediction for test configuration 

where diag(A) is the vector of diagonal elements of the matrix A, and Diag(v) is the matrix whose diagonal 
elements are the components of vector v and the off-diagonal elements are zero. 

2.1 Covariance function (kernel) 

We use a Gaussian kernel. This kernel enables us to assign a separate spatial scale hyperparameter to 
each element of the feature vector, therefore the kernel provides a more flexible description than a Gaussian 
kernel with a single hyperparameter. The next three equations show how the covariance function is evaluated 
if the function values are available for both configurations, if we have the derivative of the function available 
at one configuration, and if the derivatives are given for both configurations. 




2.2 Linear combinations 

In our case, only the linear combination of atomic energies can be directly observed. We cannot determine 
the atomic contributions to the total energy of a system uniquely from an electronic structure calculation. 
Similarly, the atomic forces — although they are available from first-principles calculations — are not derivatives 
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of atomic energies, but are sums of derivatives of different atomic contributions. It is possible to use a Gaussian 
Process to infer the underlying function even if only linear combinations of function values arc available. Now 
let y is the vector of K observed values (total energies and atomic force components). Let y' be the vector 
of ./V unobserved values of atomic energies and its derivatives corresponding to the N atomic neighborhood 
configurations. Let the N x K matrix L describe the relationship of the K observations to the N unknown 
values. The elements of L are Os and Is, and 

y = Ly' 

The covariance of the K observations is then given by 

Ckk — L t CtvtvL 



2.3 Putting it all together 

The sparsification and the linear combinations are used together to give the final expression for the atomic 
energy, in such a way that the unobserved values y' are not needed, 

A = Diag (diag(L T C7VArL — L t CjvmC a# 1 CmatL)) (now a K x K diagonal matrix) 
e, = knCM + C M ivL(A + a 2 I)- 1 L T C N M] _1 CMivL(A + a 2 I)- 1 y 

3 The data: density functional theory 

The DFT data was generated using the local density approximation in case of carbon and the PBE generalized 
gradient approximation for silicon, germanium, GaN and iron. The electronic Brillouin zone was sampled by 
using a Monkhorst-Pack k-point grid, with a k-point spacing of at most 0.3 A for insulators and 0.14A 
for iron. The plane wave cutoff was 350, 300, 300, 350, 500 cV for C, Si, Ge, GaN and Fe respectively and 
the energies were extrapolated to correct for the finite basis set. Ultrasoft pseudopotcntials were used with 

4 valence electrons for all group IV ions, 3 electrons for Ga, 5 electrons for N and 8 for Fe ions. 

4 Testing the GAP parameters 

Figure 1 shows the improvement in the distribution of force errors of the GAP model for diamond as the 
cutoff radius is increased. Figure 2 shows the same as J max is increased, which corresponds to increasing the 
spatial resolution of the bispectrum. 

5 Potential for gallium nitride 

Figures 3 and 4 show the force errors and the phonon spectrum of a simple GAP model for gallium nitride. 
The long range Coulomb interactions of the ions is significant in this system, so we augmented the local 
energy of the original GAP model by an Ewald sum of fixed charges (+1 for Ga and -1 for N). The Gaussian 
Process regression was carried out on forces and energies which were obtained from the DFT calculations by 
subtracting this Coulomb contribution. The LO/TO splitting in the phonon spectrum shows that the model 
captures the long range character of the ionic interactions correctly. 
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Figure 1: Force correlation of GAP models for diamond with different spatial cutoffs. 
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Figure 2: Force correlation of GAP models for diamond with different resolution of representation. The 
number of invariants were 4, 23 and 69 for J max =1, 3 and 5, respectively. 
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Figure 3: Force errors in GaN of the GAP model augmented with a simple Ewald sum of fixed charges with 
reference to DFT forces. 




Figure 4: Phonon spectrum of GaN calculated with GAP (solid lines) and PBE-DFT (open squares). 
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